Pagination Documentation
This document describes every pagination type supported by the pipeline generator ingestion system.
Overview
The pagination system supports multiple pagination strategies commonly used by REST APIs. Each pagination type is implemented as a handler class that extends the base PaginationHandler abstract class. The system uses a factory pattern (PaginationFactory) to instantiate the appropriate pagination handler based on the pagination object configuration.
Decision Guide
Quick Comparison Table
| Feature | Default | PaginationWithTotalPages | PageOffset | Offset | Cursor | NextPageUrl |
|---|---|---|---|---|---|---|
| Termination Method | Single request | Total pages count | Result count | Result count | Result count or cursor | Next URL presence |
| Page Parameter | None | Page number | Page number | Offset value | Cursor token | Cursor from URL |
| Limit Parameter | N/A | No | Optional | Required | Optional | Required |
| API Provides Total Pages? | N/A | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ No |
| Use Case | No pagination | APIs with page count | Page-based without count | Offset-based | Token-based | URL-based |
| Example APIs | Single response APIs | MyContactCenter | Most REST APIs | SQL-style APIs | GraphQL, Twitter | Relay connections |
| Request Pattern | GET /api | ?page=1, ?page=2... | ?page=1&limit=50 | ?offset=0&limit=100 | ?cursor=abc | ?page[after]=xyz |
| Stops When | After 1 request | current_page > totalPages | results < maxEntries | results < maxEntries | no cursor or results < maxEntries | no next URL |
Decision Flowchart
Does the API use pagination?
│
├─ NO → Use "default"
│
└─ YES → How does the API indicate the next page?
    │
    ├─ Provides complete next page URL?
    │   └─ YES → Use "nextPageUrl"
    │
    ├─ Uses cursor/token for next page?
    │   └─ YES → Use "cursor"
    │
    └─ Uses page numbers or offset?
        │
        ├─ Uses page numbers (1, 2, 3...)?
        │   │
        │   └─ API provides total pages count?
        │       ├─ YES → Use "PaginationWithTotalPages"
        │       └─ NO → Use "pageOffset"
        │
        └─ Uses offset values (0, 100, 200...)?
            └─ YES → Use "offset"
Key Differences: PaginationWithTotalPages vs PageOffset
This is the most common confusion point. Here's the key difference:
| Aspect | PaginationWithTotalPages | PageOffset |
|---|---|---|
| Termination Logic | current_page > totalPages | len(results) < maxEntries |
| Requires totalPages field | ✅ Yes (required) | ❌ No |
| Checks result count | ❌ No | ✅ Yes |
| API must provide page count | ✅ Yes | ❌ No |
| More efficient | ✅ Yes (knows end point) | ⚠️ No (must check each page) |
PaginationWithTotalPages:
- Use when API explicitly provides total page count in response (header or body)
- Stops based on total pages: current_page > totalPages
- Example: X-Pagination: {"TotalPages": 10, "CurrentPage": 1}
PageOffset:
- Use when API uses page numbers but does NOT provide total page count
- Stops when returned results are less than expected: results < maxEntriesAllowedInPage
- Example: Returns 50 items per page, then 30 on last page → stops
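The two termination rules can be sketched as a pair of predicates. This is a minimal illustration rather than the pipeline's actual code; both function names are hypothetical.

```python
from typing import Sequence


def total_pages_done(current_page: int, total_pages: int) -> bool:
    """PaginationWithTotalPages rule: stop once the page counter passes the total."""
    return current_page > total_pages


def page_offset_done(results: Sequence, max_entries_allowed_in_page: int) -> bool:
    """PageOffset rule: stop when a page is shorter than the expected page size."""
    return len(results) < max_entries_allowed_in_page


# Page 10 of 10 must still be fetched; page 11 is past the end.
print(total_pages_done(10, 10), total_pages_done(11, 10))
# A full page of 50 continues; a short page of 30 stops.
print(page_offset_done(range(50), 50), page_offset_done(range(30), 50))
```

One consequence of the PageOffset rule: if the total item count is an exact multiple of the page size, the last real page is full, so one more request is needed before a short page ends the loop (assuming the API answers that request with an empty or short page).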
Common Concepts
Pagination Object Structure
All pagination objects share common fields:
- type (required): The pagination type identifier
- resultsPath (required): Runtime expression path to extract results from the response
- maxEntriesAllowedInPage (optional): Maximum number of entries per page/request
Wildcard Pagination
You can use wildcards in resultsPath:
# Simple wildcard
resultsPath: "$response.body#/*"
# Nested wildcard
resultsPath: "$response.body#/data/*"
# Alternative syntax
resultsPath: "$response.body#/{auto}"
Runtime Expression Syntax
Runtime expressions use the following format:
$response.<source>.<reference>#/<json_path>
Sources:
- header: Extract from HTTP response headers
- body: Extract from HTTP response body
- query: Extract from query parameters (future use)
- path: Extract from path parameters (future use)
Examples:
- $response.body#/items: Extract items array from response body
- $response.header.pagination#/TotalPages: Extract TotalPages from pagination header (JSON)
- $response.header.X-Total-Count: Extract total count from header (string)
Note: The # symbol indicates that the header value is a JSON object. If # is not present, the header is treated as a plain string.
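To make the format concrete, here is a minimal sketch of how such an expression could be split into its parts. The `RuntimeExpression` type and `parse_runtime_expression` function are hypothetical names introduced for illustration; the system's real parser may differ.

```python
from typing import NamedTuple, Optional


class RuntimeExpression(NamedTuple):
    source: str               # "header", "body", "query" or "path"
    reference: Optional[str]  # e.g. the header name; None when absent
    json_path: Optional[str]  # text after "#", or None for a plain string


def parse_runtime_expression(expression: str) -> RuntimeExpression:
    """Split '$response.<source>.<reference>#/<json_path>' into its parts."""
    prefix = "$response."
    if not expression.startswith(prefix):
        raise ValueError(f"not a runtime expression: {expression!r}")
    # Everything before "#" names the target; everything after is the path.
    target, separator, json_path = expression[len(prefix):].partition("#")
    source, _, reference = target.partition(".")
    return RuntimeExpression(source, reference or None, json_path if separator else None)


for example in (
    "$response.body#/items",
    "$response.header.pagination#/TotalPages",
    "$response.header.X-Total-Count",
):
    print(parse_runtime_expression(example))
```

A `json_path` of `None` corresponds to the plain-string case described in the note above.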
Pagination Types
1. Default Pagination
Type Identifier: default
Use Case: APIs that return all data in a single response without pagination.
Configuration:
type: "default"
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100
Behavior:
- Makes exactly one request
- Returns empty parameter map (no pagination parameters)
- Terminates after the first response
2. Pagination with Total Pages
Type Identifier: PaginationWithTotalPages
Use Case: APIs like MyContactCenter that return total pages in response headers or body.
Configuration:
type: "PaginationWithTotalPages"
pageOffsetParam: "Page"
totalPages: "$response.header.pagination#/TotalPages"
resultsPath: "$response.body#/contacts"
maxEntriesAllowedInPage: 100
Fields:
- pageOffsetParam (required): Query parameter name for page number (e.g., "page", "Page", "pageNumber")
- totalPages (required): Runtime expression to extract total pages count from response
- resultsPath (required): Runtime expression to extract results array from response
- maxEntriesAllowedInPage: Maximum entries per page (informational, not enforced)
Behavior:
- Starts at page 1
- Increments page number for each request
- Extracts total pages from response (header or body)
- Terminates when current page exceeds total pages
Request Sequence:
- GET /api/contacts?Page=1 → Response: TotalPages: 10
- GET /api/contacts?Page=2
- GET /api/contacts?Page=3
- ... continues until page > 10
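The loop below is a rough sketch of this behavior, assuming the total is delivered as JSON in a `pagination` header as in the configuration above. The `fetch` callable and `fake_fetch` stub are hypothetical stand-ins for the real HTTP layer.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical fetch signature: query params in, (headers, body) out.
Fetch = Callable[[Dict[str, int]], Tuple[Dict[str, str], dict]]


def fetch_all_with_total_pages(fetch: Fetch, page_param: str = "Page") -> List[dict]:
    results: List[dict] = []
    current_page, total_pages = 1, 1  # the real total is unknown until the first response
    while current_page <= total_pages:
        headers, body = fetch({page_param: current_page})
        # Mirrors totalPages: "$response.header.pagination#/TotalPages"
        total_pages = int(json.loads(headers["pagination"])["TotalPages"])
        # Mirrors resultsPath: "$response.body#/contacts"
        results.extend(body["contacts"])
        current_page += 1
    return results


def fake_fetch(params: Dict[str, int]) -> Tuple[Dict[str, str], dict]:
    """Stub API with three pages, one contact per page."""
    headers = {"pagination": json.dumps({"TotalPages": 3})}
    return headers, {"contacts": [{"id": params["Page"]}]}


print(fetch_all_with_total_pages(fake_fetch))
```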
3. Page Offset Pagination
Type Identifier: pageOffset
Use Case: APIs that use page-based pagination with optional page size limits but don't provide total page count.
Configuration:
type: "pageOffset"
pageOffsetParam: "page"
limitParam: "per_page" # Optional, can be empty string
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100
Fields:
- pageOffsetParam (required): Query parameter name for page number
- limitParam (optional): Query parameter name for page size limit (can be empty string if not used)
- resultsPath (required): Runtime expression to extract results array from response
- maxEntriesAllowedInPage (required): Maximum entries per page
Behavior:
- Starts at page 1
- Increments page number for each request
- Includes limitParam in request if specified
- Terminates when returned results count < maxEntriesAllowedInPage
Request Sequence:
- GET /api/users?page=1&per_page=100 → Returns 100 items
- GET /api/users?page=2&per_page=100 → Returns 100 items
- GET /api/users?page=3&per_page=100 → Returns 30 items (< 100) → STOPS
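A minimal sketch of this loop follows. It assumes that maxEntriesAllowedInPage is also the value sent for limitParam, which the section above implies but does not state; `fetch` and `fake_fetch` are hypothetical stand-ins for the HTTP layer.

```python
from typing import Callable, Dict, List

DATA = list(range(230))  # stub dataset: pages of 100, 100 and 30 items
calls: List[Dict[str, int]] = []


def fake_fetch(params: Dict[str, int]) -> List[int]:
    """Stub API that slices DATA by page number and page size."""
    calls.append(dict(params))
    page, size = params["page"], params["per_page"]
    return DATA[(page - 1) * size : page * size]


def fetch_all_page_offset(
    fetch: Callable[[Dict[str, int]], List[int]],
    page_param: str = "page",
    limit_param: str = "per_page",
    max_entries: int = 100,
) -> List[int]:
    results: List[int] = []
    page = 1
    while True:
        params = {page_param: page}
        if limit_param:  # limitParam may be an empty string
            params[limit_param] = max_entries  # assumption: limit value == max_entries
        items = fetch(params)
        results.extend(items)
        if len(items) < max_entries:  # short page: nothing more to fetch
            return results
        page += 1


everything = fetch_all_page_offset(fake_fetch)
print(len(everything), "items in", len(calls), "requests")
```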
4. Offset Pagination
Type Identifier: offset
Use Case: APIs that use offset/limit pagination (SQL LIMIT/OFFSET style).
Configuration:
type: "offset"
offsetParam: "offset"
limitParam: "pageSize"
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100
Fields:
- offsetParam (required): Query parameter name for offset value
- limitParam (required): Query parameter name for page size limit
- resultsPath (required): Runtime expression to extract results array from response
- maxEntriesAllowedInPage (required): Number of entries per page (also used as limit value)
Behavior:
- Starts at offset 0
- Increments offset by maxEntriesAllowedInPage for each request
- Terminates when returned results count < maxEntriesAllowedInPage
Request Sequence:
- GET /api/users?offset=0&pageSize=100
- GET /api/users?offset=100&pageSize=100
- GET /api/users?offset=200&pageSize=100
- ... continues until results count < 100
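The parameter progression can be sketched as a generator. This is an illustrative assumption about how the offset state advances, not the handler's actual code; `offset_parameter_maps` is a hypothetical name.

```python
from itertools import islice
from typing import Dict, Iterator


def offset_parameter_maps(
    offset_param: str, limit_param: str, max_entries: int
) -> Iterator[Dict[str, int]]:
    """Yield the query parameters for each successive request."""
    offset = 0
    while True:
        # maxEntriesAllowedInPage doubles as the limit value.
        yield {offset_param: offset, limit_param: max_entries}
        offset += max_entries


first_three = list(islice(offset_parameter_maps("offset", "pageSize", 100), 3))
print(first_three)
```

The generator never ends on its own; the caller stops consuming it once a response contains fewer than `max_entries` results.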
5. Cursor Pagination
Type Identifier: cursor
Use Case: APIs that use cursor/token-based pagination (e.g., GraphQL-style, Twitter API).
Configuration:
type: "cursor"
cursorParam: "cursor"
limitParam: "pageSize" # Optional
resultsPath: "$response.body#/items"
cursorPath: "$response.body#/pagination/nextCursor"
maxEntriesAllowedInPage: 100
Fields:
- cursorParam (required): Query parameter name for cursor value
- limitParam (optional): Query parameter name for page size limit (can be omitted)
- resultsPath (required): Runtime expression to extract results array from response
- cursorPath (required): Runtime expression to extract next cursor value from response
- maxEntriesAllowedInPage (required): Maximum entries per page
Behavior:
- Starts with no cursor (or empty cursor)
- Extracts cursor from response using cursorPath
- Uses extracted cursor for next request
- Terminates when no cursor is returned or results count < maxEntriesAllowedInPage
Request Sequence:
- GET /api/users?pageSize=100
- GET /api/users?cursor=eyJpZCI6IjEyMzQ1In0&pageSize=100
- GET /api/users?cursor=eyJpZCI6IjY3ODkwIn0&pageSize=100
- ... continues until no cursor is returned or a page has fewer than 100 results
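The following sketch shows one way the cursor loop could work, using the parameter names from the configuration above. The `fetch` callable, the `PAGES` stub data and the hard-coded body layout are hypothetical; the page size is set to 2 only to keep the stub small.

```python
from typing import Callable, Dict, List, Optional

# Stub API: each response is keyed by the cursor that requests it.
PAGES: Dict[Optional[str], dict] = {
    None: {"items": [1, 2], "pagination": {"nextCursor": "abc"}},
    "abc": {"items": [3, 4], "pagination": {"nextCursor": "def"}},
    "def": {"items": [5, 6], "pagination": {"nextCursor": None}},
}


def fake_fetch(params: Dict[str, object]) -> dict:
    return PAGES[params.get("cursor")]


def fetch_all_cursor(
    fetch: Callable[[Dict[str, object]], dict],
    cursor_param: str = "cursor",
    limit_param: str = "pageSize",
    max_entries: int = 100,
) -> List[int]:
    results: List[int] = []
    cursor: Optional[str] = None  # the first request carries no cursor
    while True:
        params: Dict[str, object] = {}
        if cursor:
            params[cursor_param] = cursor
        if limit_param:  # limitParam is optional
            params[limit_param] = max_entries
        body = fetch(params)
        items = body["items"]  # resultsPath: "$response.body#/items"
        results.extend(items)
        # cursorPath: "$response.body#/pagination/nextCursor"
        cursor = (body.get("pagination") or {}).get("nextCursor")
        if not cursor or len(items) < max_entries:
            return results


print(fetch_all_cursor(fake_fetch, max_entries=2))
```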
6. Next Page URL Pagination
Type Identifier: nextPageUrl
Use Case: APIs such as GraphQL Relay connections, or REST APIs that return a complete URL for the next page.
Configuration:
type: "nextPageUrl"
nextUrlPath: "$response.body#/pagination/nextPageUrl"
cursorParam: "page[after]"
limitParam: "pageSize"
resultsPath: "$response.body#/items"
maxEntriesAllowedInPage: 100
Fields:
- nextUrlPath (required): Runtime expression to extract next page URL from response
- cursorParam (required): Query parameter name to extract from the next URL (e.g., "page[after]", "cursor")
- limitParam (required): Query parameter name for page size limit
- resultsPath (required): Runtime expression to extract results array from response
- maxEntriesAllowedInPage (required): Maximum entries per page
Behavior:
- First request includes only limitParam
- Extracts next page URL from response using nextUrlPath
- Parses cursor value from URL query parameters using cursorParam
- Uses extracted cursor for subsequent requests
- Terminates when no next URL is returned
Request Sequence:
- GET /api/users?pageSize=100
- Response contains: {"pagination": {"nextPageUrl": "/api/users?pageSize=100&page[after]=abc123"}}
- GET /api/users?pageSize=100&page[after]=abc123
- Response contains: {"pagination": {"nextPageUrl": "/api/users?pageSize=100&page[after]=xyz789"}}
- GET /api/users?pageSize=100&page[after]=xyz789
- ... continues until no next URL returned
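The step that parses the cursor out of the next URL can be sketched with the standard library's `urllib.parse`. The function name `cursor_from_next_url` is hypothetical, and the real handler may extract the value differently.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def cursor_from_next_url(next_url: Optional[str], cursor_param: str) -> Optional[str]:
    """Return the cursorParam value from a next-page URL, or None if absent."""
    if not next_url:
        return None  # no next URL means pagination is finished
    query = parse_qs(urlsplit(next_url).query)
    values = query.get(cursor_param)
    return values[0] if values else None


# Works for relative URLs and for percent-encoded bracket parameters.
print(cursor_from_next_url("/api/users?pageSize=100&page[after]=abc123", "page[after]"))
print(cursor_from_next_url("https://example.com/api/users?page%5Bafter%5D=xyz789", "page[after]"))
print(cursor_from_next_url(None, "page[after]"))
```

Because `parse_qs` decodes percent-escapes, the same `cursorParam` value matches both the raw `page[after]` and the encoded `page%5Bafter%5D` form.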
Examples
Example 1: Zoho Books Contacts API (Page Offset)
type: "pageOffset"
pageOffsetParam: "page"
limitParam: "per_page"
resultsPath: "$response.body#/contacts"
maxEntriesAllowedInPage: 200
Example 2: GraphQL-style Cursor Pagination
type: "cursor"
cursorParam: "after"
limitParam: "first"
resultsPath: "$response.body#/data/users/edges"
cursorPath: "$response.body#/data/users/pageInfo/endCursor"
maxEntriesAllowedInPage: 50
Example 3: Header-based Total Pages
type: "PaginationWithTotalPages"
pageOffsetParam: "Page"
totalPages: "$response.header.pagination#/TotalPages"
resultsPath: "$response.body#/contacts"
maxEntriesAllowedInPage: 100
Example 4: Offset-based Pagination
type: "offset"
offsetParam: "skip"
limitParam: "take"
resultsPath: "$response.body#/results"
maxEntriesAllowedInPage: 100
Example 5: Next Page URL with Complex Parameter
type: "nextPageUrl"
nextUrlPath: "$response.body#/links/next"
cursorParam: "page[after]"
limitParam: "pageSize"
resultsPath: "$response.body#/data"
maxEntriesAllowedInPage: 50
Best Practices & Troubleshooting
Best Practices
- Always specify resultsPath: Ensures consistent result extraction
- Set appropriate maxEntriesAllowedInPage: Balance between API limits and request efficiency
- Test pagination termination: Ensure the termination condition works correctly for your API
- Use header pagination when available: Header-based pagination metadata is more efficient than parsing body
- Choose the right pagination type: Use the decision guide to select the most efficient type for your API
Common Issues
| Issue | Cause | Solution |
|---|---|---|
| Infinite Loop | Termination condition never met | Verify maxEntriesAllowedInPage or totalPages path is correct |
| Missing Results | Incorrect resultsPath | Log response structure and verify path matches actual response |
| Wrong Page Values | Parameter names don't match API | Check API documentation for correct parameter names |
| Header Parsing Errors | Missing # separator for JSON headers | Use # when headers contain JSON: $response.header.X#/field |
| Premature Termination | maxEntriesAllowedInPage too high | Set to actual page size returned by API |
Debugging Tips
- Log the parameter_map for each request to verify parameters are correct
- Log the response structure to verify resultsPath correctness
- Check that termination conditions are met after the last page
- For cursor pagination, log cursor values to ensure they're being extracted correctly
- For next URL pagination, log parsed URLs to verify correct cursor extraction
Implementation Details
PaginationHandler Base Class:
All pagination handlers extend the PaginationHandler abstract base class, which provides:
- get_parameter_map(): Returns query parameters for the current page request
- should_terminate_loop(): Determines if pagination should stop
- get_next_page(): Updates internal state for next iteration
- update_response(): Updates the response object after each request
- get_results(): Extracts results from response using resultsPath
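As a rough sketch of how these five methods might fit together, the code below reconstructs the interface and drives it with a simple loop. Only the method names come from this document; the argument lists, the order of calls in `paginate`, and the `DefaultHandler` body are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class PaginationHandler(ABC):
    """Hypothetical reconstruction of the base class interface."""

    @abstractmethod
    def get_parameter_map(self) -> Dict[str, Any]: ...

    @abstractmethod
    def should_terminate_loop(self) -> bool: ...

    @abstractmethod
    def get_next_page(self) -> None: ...

    @abstractmethod
    def update_response(self, response: Dict[str, Any]) -> None: ...

    @abstractmethod
    def get_results(self) -> List[Any]: ...


class DefaultHandler(PaginationHandler):
    """Single-request handler: empty parameter map, stops after one response."""

    def __init__(self, results_key: str) -> None:
        self.results_key = results_key  # simplified stand-in for resultsPath
        self.response: Dict[str, Any] = {}
        self.requests_made = 0

    def get_parameter_map(self) -> Dict[str, Any]:
        return {}

    def should_terminate_loop(self) -> bool:
        return self.requests_made >= 1

    def get_next_page(self) -> None:
        self.requests_made += 1

    def update_response(self, response: Dict[str, Any]) -> None:
        self.response = response

    def get_results(self) -> List[Any]:
        return self.response.get(self.results_key, [])


def paginate(handler: PaginationHandler, fetch: Callable[[Dict[str, Any]], Dict[str, Any]]) -> List[Any]:
    """Assumed driver loop: request, record, collect, advance, check."""
    collected: List[Any] = []
    while True:
        handler.update_response(fetch(handler.get_parameter_map()))
        collected.extend(handler.get_results())
        handler.get_next_page()
        if handler.should_terminate_loop():
            return collected


seen_params: List[Dict[str, Any]] = []


def fake_fetch(params: Dict[str, Any]) -> Dict[str, Any]:
    seen_params.append(params)
    return {"items": [1, 2, 3]}


print(paginate(DefaultHandler("items"), fake_fetch))
```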
Factory Pattern:
The PaginationFactory class instantiates the appropriate handler:
pagination_handler = PaginationFactory.get_pagination_handler(pagination_object)
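One common way to build such a factory is a registry that maps each type identifier to a handler class. The sketch below is an assumption about the internals, not the actual implementation: the handler class names, the dict-shaped `pagination_object`, and the `ValueError` on unknown types are all hypothetical.

```python
from typing import Any, Dict, Type


class Handler:
    """Placeholder handler that only stores its configuration."""

    def __init__(self, pagination_object: Dict[str, Any]) -> None:
        self.config = pagination_object


class DefaultHandler(Handler): pass
class TotalPagesHandler(Handler): pass
class PageOffsetHandler(Handler): pass
class OffsetHandler(Handler): pass
class CursorHandler(Handler): pass
class NextPageUrlHandler(Handler): pass


class PaginationFactory:
    # Keys are the type identifiers documented above.
    _handlers: Dict[str, Type[Handler]] = {
        "default": DefaultHandler,
        "PaginationWithTotalPages": TotalPagesHandler,
        "pageOffset": PageOffsetHandler,
        "offset": OffsetHandler,
        "cursor": CursorHandler,
        "nextPageUrl": NextPageUrlHandler,
    }

    @classmethod
    def get_pagination_handler(cls, pagination_object: Dict[str, Any]) -> Handler:
        type_id = pagination_object.get("type")
        handler_class = cls._handlers.get(type_id)
        if handler_class is None:
            raise ValueError(f"unsupported pagination type: {type_id!r}")
        return handler_class(pagination_object)


handler = PaginationFactory.get_pagination_handler({"type": "cursor", "cursorParam": "cursor"})
print(type(handler).__name__)
```

A registry keeps the type identifiers in one place, so supporting a new pagination strategy means adding one class and one dictionary entry.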
Version History
- v1.0: Initial implementation with 6 pagination types
- Supports JSON and XML payloads
- Runtime expression parsing for flexible path extraction